Evaluate watsonx.ai models
Prerequisites
- Create a watsonx.ai instance
Instructions
- Write a prompt in Prompt Lab.
- Save it as a prompt template.
- Promote the prompt template to a deployment space.
- Deploy the prompt template.
- View it in the AI Factsheet and track it in an AI use case.
- Evaluate it using the payload and/or feedback datasets.
- Example datasets:
- [Resume Extraction example](../../assets/datasets/Resume Extraction example.csv)
- [Resume Summarization feedback data example](../../assets/datasets/Resume Summarization feedback data example.csv)
- [Resume Summarization payload data example](../../assets/datasets/Resume Summarization payload data example.csv)
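Before uploading a payload or feedback CSV for evaluation, it can help to confirm that the file has the columns you expect and no empty cells. The following is a minimal sketch using only the Python standard library. The function name `check_dataset` and the column names `resume` and `reference_summary` are hypothetical and are not taken from the example datasets above; replace them with the variables your prompt template actually uses.

```python
import csv
import io


def check_dataset(csv_file, required_columns):
    """Report row count, missing columns, and rows with blank required cells.

    csv_file is any open text file (or file-like object) containing CSV data.
    required_columns is a list of column names chosen by the caller.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in required_columns if c not in header]
    present = [c for c in required_columns if c in header]

    row_count = 0
    blank_rows = []
    for index, row in enumerate(reader, start=1):
        row_count += 1
        # A cell counts as blank if it is absent, empty, or only whitespace.
        if any(not (row.get(c) or "").strip() for c in present):
            blank_rows.append(index)

    return {"rows": row_count, "missing_columns": missing, "blank_rows": blank_rows}


# Hypothetical in-memory sample; column names are assumptions for illustration.
sample = io.StringIO(
    "resume,reference_summary\n"
    "Jane has 5 years of Python experience.,Python developer with 5 years.\n"
    "John managed a team of ten.,\n"
)
report = check_dataset(sample, ["resume", "reference_summary"])
print(report)
```

To check a file on disk, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `check_dataset` in place of `sample`.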
Metrics Interpretation
- Extraction
  - ROUGE scores
- Generation
  - Readability score
  - HAP score
- Summarization
  - ROUGE scores
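To help interpret the ROUGE scores listed for extraction and summarization, here is a simplified sketch of ROUGE-N, which measures n-gram overlap between a reference text and a generated text. It assumes lowercase whitespace tokenization and no stemming, so its values may differ from those the evaluation reports. The function names and the sample sentences are illustrative only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


reference = "five years of python experience"
candidate = "five years python experience"

rouge1 = rouge_n(reference, candidate, n=1)
rouge2 = rouge_n(reference, candidate, n=2)
print(rouge1)
print(rouge2)
```

Each value ranges from 0 to 1. Recall is the share of the reference's n-grams that appear in the generated text, precision is the share of the generated text's n-grams that appear in the reference, and F1 balances the two. Higher values indicate closer agreement with the reference.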